1 Introduction

and energy, which, like information, is another fundamental, irreducible concept.¹ Although the doctoral thesis of Shannon, one of the fathers of information theory, was entitled “An algebra for theoretical genetics”, biology outside genetics remained largely untouched by developments in information science.

One might speculate on why information was placed so firmly at the core of molecular biology by one of its pioneers. During the preceding decade, there had been tremendous advances in the theory of communication, the science of the transmission of information. Shannon published his seminal paper on the mathematical theory of communication only a few years before Watson and Crick’s work. In that context, the notion of a sequence of DNA bases as a message with meaning seemed only natural, and the next major development (the establishment of the genetic code with which the DNA sequence could be transformed into a protein sequence) was cast very much in the language and concepts of communication theory. More puzzling is that there was not subsequently a more vigorous interchange between the two disciplines. Probably the lack of extensive datasets and of powerful computers, without which the necessary calculations were intolerably tedious or simply too long, provides sufficient explanation for this neglect. Hence, now that both these requirements are being met, it is not surprising that there is a great revival in the application of information ideas to biology. One may indeed hope that this revival will at last lead to a real answer to the vital question “what is life?” In other words, information science is perhaps the missing discipline that, along with the physics and chemistry already being brought to bear, is needed to answer the question.

1.1 What is Bioinformatics?

The term “bioinformatics” seems to have been first used in the mid-1980s to describe the application of information science and technology in the life sciences. The definition was at that time very general, covering everything from robotics to artificial intelligence. Later, bioinformatics came to be somewhat prosaically defined as “the use of computers to retrieve, process, analyse, and simulate biological information”. An even narrower definition was “the application of information technology to the management of biological data”. Such definitions fail to capture the centrality of information in biology. If, indeed, information is the most fundamental concept underlying biology, and bioinformatics is the exploration of all the ramifications and implications of that basis, then bioinformatics is excellently positioned to revive consideration of the central question “what is life?” A more appropriate definition of bioinformatics is, therefore, “the science of how information is generated, trans-

1 The two are, of course, intimately related. Energy may be needed to produce information and, as Szilard showed in his exorcism of Maxwell’s demon, the judicious use of information can produce energy.